Evaluating the Unification of Multiple Information Retrieval Techniques into a News Indexing Service
Authors
Abstract
As online information sources grow rapidly, so does the volume of online news published each day. Several approaches have been proposed for organizing this immense amount of data. In this work we explore the integration of multiple information retrieval techniques, such as text preprocessing, n-gram expansion, summarization, categorization, and item/user clustering, into a single mechanism designed to consolidate and index news articles from major news portals around the web. Our goal is to let users quickly and seamlessly obtain the news of the day that appeals to them. We show how the application of each of the proposed techniques gradually improves the precision of the news articles suggested to a number of registered system users, and how, in aggregate, these techniques provide a unified solution to the recommendation problem.
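Two of the ideas named in the abstract, n-gram expansion of preprocessed text and precision of the suggested articles, can be illustrated with a minimal sketch. The abstract does not describe the actual implementation, so everything here is an assumption for illustration only: the function names `preprocess`, `expand_ngrams`, and `precision` are hypothetical, the tokenizer is a deliberately simple stand-in, and precision is computed as the usual fraction of suggested items that are relevant.

```python
import re


def preprocess(text):
    """Lowercase the text and split it into alphanumeric tokens.

    A simple stand-in for preprocessing; not the paper's method.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_ngrams(tokens, max_n=2):
    """Return all word n-grams for n = 1..max_n, ordered by n, then position."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def precision(suggested, relevant):
    """Fraction of suggested items that are relevant (0.0 if nothing was suggested)."""
    if not suggested:
        return 0.0
    hits = sum(1 for item in suggested if item in relevant)
    return hits / len(suggested)


# Hypothetical usage: index terms for one headline, then score a suggestion list.
terms = expand_ngrams(preprocess("Markets rally as rates fall"), max_n=2)
score = precision(["a1", "a2", "a3", "a4"], {"a1", "a3"})
```

Here `terms` holds the unigrams followed by the bigrams of the headline, and `score` is the share of the four hypothetical article IDs that a user actually found relevant.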
Similar resources
Content Based Radiographic Images Indexing and Retrieval Using Pattern Orientation Histogram
Introduction: Content Based Image Retrieval (CBIR) is a method of image searching and retrieval in a database. In medical applications, CBIR is a tool used by physicians to compare previous and current medical images associated with patients' pathological conditions. As the volume of pictorial information stored in medical image databases grows, efficient image indexing and retri...
News-Oriented Automatic Chinese Keyword Indexing
In our information era, keywords are very useful for information retrieval, text clustering, and so on. News is a domain that always attracts a large amount of attention. However, the majority of news articles come without keywords, and indexing them manually is costly. Based on the characteristics of news articles and the resources available, this paper introduces a simple procedure to index keywords...
Reflections on Image Indexing: A Picture Is Worth a Thousand Words
Purpose: This paper presents various image indexing techniques and discusses their advantages and limitations. Methodology: Through a review of the literature, it identifies three main image indexing techniques, namely concept-based image indexing, content-based image indexing, and folksonomy. It then describes each technique. Findings: Concept-based image indexing is te...
News-Oriented Keyword Indexing with Maximum Entropy Principle
In our information era, keywords are very useful for information retrieval, text clustering, and so on. News is a domain that always attracts a large amount of attention. Based on the characteristics of news documents and the resources available, this paper proposes using a Maximum Entropy (ME) model to conduct automatic keyword indexing. The focus of ME-based keyword indexing is how to obtain all the can...
Public Transport Ontology for Passenger Information Retrieval
Passenger information aims at improving the user-friendliness of public transport systems while influencing passenger route choices to satisfy transit user’s travel requirements. The integration of transit information from multiple agencies is a major challenge in implementation of multi-modal passenger information systems. The problem of information sharing is further compounded by the multi-l...
Publication date: 2014